A Learning Technique to Determine Criteria for Multiple Document Summarization
Abstract
In this paper we describe a new method of automatic summarization based on a learning step that identifies the criteria maximizing the correlation between human summaries and peer extracts. The proposed method uses a genetic algorithm to produce extracts from a collection of source documents describing the same event. These extracts are compared to human summaries using the ROUGE measure in order to identify the correlation between statistical and linguistic criteria and the ROUGE score. Experimental results are presented for a document set taken from the DUC’06 evaluation conference.
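As a rough illustration of the comparison step mentioned in the abstract, the sketch below computes an n-gram recall in the spirit of ROUGE-N between a candidate extract and one human summary. It is not the authors' implementation: the function names `ngrams` and `rouge_n_recall`, the whitespace tokenization, and the lowercasing are all assumptions made for this example, and the official ROUGE toolkit offers more options (stemming, stop-word removal, multiple references).

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Assumption: simple lowercased whitespace tokenization, one reference.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: each reference n-gram is matched at most as many
    # times as it appears in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    extract = "the cat sat on the mat"
    human = "the cat lay on the mat"
    print(rouge_n_recall(extract, human, n=1))
    print(rouge_n_recall(extract, human, n=2))
```

In a setup like the one the abstract describes, a score of this kind could serve as the quantity against which each candidate extract is evaluated, although the abstract does not state how the score enters the genetic algorithm.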
Similar resources
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
A Hybrid Approach for Extractive Document Summarization Using Machine Learning and Clustering Technique
Usually, presence of the same information in multiple documents is the main problem faced in effective information access. Instead of this redundant information thus accessed or retrieved, users are interested in retrieving information that addresses one or other several aspects. In such situation, text summarization proves to be very useful. Not only in Information retrieval, but it is an extr...
A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization
Ordering information is a difficult but important task for applications generating natural-language text. We present a bottom-up approach to arranging sentences extracted for multi-document summarization. To capture the association and order of two textual segments (eg, sentences), we define four criteria, chronology, topical-closeness, precedence, and succession. These criteria are integrated ...
Feature expansion for query-focused supervised sentence ranking
We present a supervised sentence ranking approach for use in extractive summarization. Using a general machine learning technique provides great flexibility for incorporating varied new features, which we demonstrate. The system proves quite effective at query-focused multi-document summarization, both for single summaries and for series of update summaries.
A Cluster Based Keyword Filtration Approach for Web Document Summarization
Summarization, an extremely important technique in Data Mining is an automatic learning technique aimed to extract the most valuable information from a large size document or the articles. The goal is to create the summary of the document, but substantially different from each other. Text Document summarization refers to the summarization of text documents based upon their content. The proposed...
Publication date: 2008